Outline

Below is an outline of the main scientific results obtained with funding from this project:

1.  Distributed  algorithms  for  positioning  and  low-rank  approximation. 

Publications [KMO11, MO10].

2.  Positioning  via  convex  optimization. 

Publications [JM11, JM13c].

3.  Approximate  message  passing  algorithms. 

Publications [DMM11, BM11, BLM12, DJM13, DGM13, JM12].

4.  Finding  highly  connected/atypical  subnetworks. 

Publications  [DM13]. 

5.  Assessing  uncertainty  in  high  dimensional  statistics. 

Publications  [JM13b,  JM13a]. 

The main collaborators in this research have been Mohsen Bayati (Stanford University), Yash
Deshpande (graduate student, Stanford University), David Donoho (Stanford University), Morteza
Ibrahimi (graduate student, Stanford University), Adel Javanmard (graduate student, Stanford
University), and Satish Korada (postdoc, Stanford University). The work of Deshpande, Ibrahimi,
Javanmard, and Korada was partially supported through this grant.

All publications are available in leading journals and conference proceedings. Publications
under review are made available online through arXiv and through the PI's webpage. The next
sections provide pointers to the main results.

Distributed  algorithms  for  positioning  and  low-rank  approximation 

Figure 1: Success probability of OPTSPACE POSITIONING as a function of the measurement range
$r_0$, for various network sizes: n nodes are placed uniformly in the unit square $[0, 1]^2$. On
the right, $r_0$ is divided by the connectivity scale $r(n) = \sqrt{(\log n)/n}$. The vertical
line marks the onset of connectivity.

The basic question in matrix completion is to infer a large low-rank matrix from a small subset of
its entries. Positioning refers to the task of inferring the locations of n points from a subset of
their pairwise distances. It turns out that positioning can be viewed as a matrix completion
problem, although of a peculiar type. The paper [MO10] develops an algorithm for positioning using
ideas from matrix completion, cf. Fig. 1. A distributed implementation is also demonstrated.
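
One way to see the link between positioning and low-rank matrices: the matrix of squared pairwise distances between n points in dimension d has rank at most d + 2. The sketch below checks this numerically and recovers positions from a complete distance matrix by classical multidimensional scaling. It is an illustration only, not the algorithm of [MO10], which handles the case where only a subset of distances is observed; the function names are hypothetical.

```python
import numpy as np

def squared_distance_matrix(X):
    """Squared Euclidean distances between the rows of X (an n x d array)."""
    sq = np.sum(X**2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.maximum(D, 0.0)  # clip tiny negative values caused by round-off

def classical_mds(D, d):
    """Recover n points in dimension d from a complete squared-distance matrix D.

    The output matches the original positions only up to a rigid transformation.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering projector
    G = -0.5 * J @ D @ J                  # Gram matrix of the centered points
    w, V = np.linalg.eigh(G)              # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:d]         # indices of the d largest eigenvalues
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# Example: 30 points drawn uniformly in the unit square.
rng = np.random.default_rng(0)
points = rng.random((30, 2))
D = squared_distance_matrix(points)
recovered = classical_mds(D, d=2)
```

Since the recovered points are defined only up to rotation, reflection, and translation, the natural check is to compare pairwise distances rather than coordinates.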

Many algorithms that compute the positions of the nodes of a wireless network on the basis of
pairwise distance measurements require a few leading eigenvectors of the distance matrix. One
example is MDS-MAP. While eigenvector calculation is a standard topic in numerical linear algebra,
it becomes challenging under severe communication or computation constraints, or in the absence of
central scheduling. The paper [KMO11] investigates the possibility of computing the leading
eigenvectors of a large data matrix through gossip algorithms. A new algorithm is proposed that
amounts to iteratively multiplying a vector by independent random sparsifications of the original
matrix and averaging the resulting normalized vectors. This can be viewed as a generalization of
gossip algorithms for consensus. The algorithms outperform state-of-the-art methods in a
communication-limited scenario.
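
The iteration described above can be sketched in a few lines of NumPy. This is a minimal centralized sketch under stated assumptions, not the distributed algorithm of [KMO11]: each entry is kept independently with probability `p` and rescaled by `1/p`, the iteration starts from the uniform vector, and all normalized iterates are averaged with equal weight. The function names and these specific choices are hypothetical.

```python
import numpy as np

def sparsify(M, p, rng):
    """Keep each entry of M independently with probability p, rescaled by 1/p.

    The rescaling makes the sparsified matrix equal to M in expectation.
    """
    mask = rng.random(M.shape) < p
    return np.where(mask, M / p, 0.0)

def gossip_top_eigenvector(M, p=0.5, iters=300, seed=0):
    """Estimate the leading eigenvector of M from sparsified matrix-vector products."""
    rng = np.random.default_rng(seed)
    n = M.shape[0]
    x = np.ones(n) / np.sqrt(n)          # assumed starting point: uniform vector
    avg = np.zeros(n)
    for _ in range(iters):
        x = sparsify(M, p, rng) @ x      # a fresh random sparsification at every step
        x /= np.linalg.norm(x)           # normalize the iterate
        avg += x                         # accumulate the normalized vectors
    return avg / np.linalg.norm(avg)
```

Each step touches only about a fraction `p` of the entries, which is the point in a communication-limited setting; averaging the normalized iterates reduces the noise introduced by the sparsification.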

Positioning  via  convex  optimization 

In the presence of noise, maximum likelihood localization is a hard non-convex optimization problem.
The papers [JM11, JM13c] propose a reconstruction algorithm based on semidefinite programming.
This is the first algorithm of this type for which tight performance guarantees have been proved.
For a random geometric graph model and uniformly bounded noise, these papers establish a precise
characterization of the algorithm's performance. In particular, in the noiseless case, there exists
a connectivity radius $r_0$ beyond which the algorithm reconstructs the exact positions (up to rigid
transformations). In the presence of noise, the papers establish upper and lower bounds on the
reconstruction error that match up to a factor that depends only on the dimension d and the
average degree of the nodes in the graph.

Figure 2: Left: phase transitions for several compressed sensing schemes. Scheme I: standard
compressed sensing with dense partial Fourier matrices and convex optimization-based reconstruction.
Scheme II: dense partial Fourier matrices and Bayes-optimal AMP reconstruction. Scheme III:
‘spatially coupled’ partial Gabor matrices and Bayes-optimal AMP reconstruction. Right: evolution of
the mean square reconstruction error across the signal, as the AMP iteration proceeds.


Approximate  message  passing  algorithms 

Approximate message passing (AMP) algorithms were developed in [DMM09] as a way to solve
certain compressed sensing reconstruction problems. The basic idea is to define a graphical model
that is associated with the problem of interest, and to apply methods for approximate inference in
graphical models, in particular message passing algorithms inspired by belief propagation. Often
the resulting graph is dense, which is at odds with the conventional wisdom that this class of
algorithms is most effective on sparse graphs.

It was soon realized that the same approach can be applied to a host of other statistical estimation
problems (see [Mon12] for a brief overview and the next section for a specific example). Further,
the theory developed in [DMM09, DMM11, BM11, BLM12] provides a sharp asymptotic analysis of such
algorithms. This analysis shows that AMP is extremely effective on some classes of dense graphs
and, furthermore, that dense graphs bring along special simplifications that can reduce the
computational complexity with respect to sparse cases.
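
For concreteness, here is a minimal sketch of an AMP iteration for noiseless compressed sensing with a soft-thresholding denoiser, in the spirit of [DMM09]. Several choices are assumptions made for illustration: the sensing matrix has i.i.d. Gaussian entries of variance 1/m, the threshold is `alpha` times an empirical estimate of the effective noise level, and the function names and default parameters are hypothetical.

```python
import numpy as np

def soft_threshold(v, theta):
    """Componentwise soft thresholding at level theta."""
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def amp_soft_threshold(A, y, alpha=1.5, iters=50):
    """AMP sketch for recovering a sparse x from y = A x.

    A is an m x n matrix whose columns have approximately unit norm.
    """
    m, n = A.shape
    x = np.zeros(n)
    z = y.astype(float).copy()                  # current residual
    for _ in range(iters):
        tau = np.linalg.norm(z) / np.sqrt(m)    # estimate of the effective noise level
        x = soft_threshold(x + A.T @ z, alpha * tau)
        # The last term is the memory ("Onsager") correction that distinguishes
        # AMP from plain iterative thresholding: the previous residual times
        # (number of non-zero coordinates of the new estimate) / m.
        z = y - A @ x + z * (np.count_nonzero(x) / m)
    return x
```

Each iteration costs two matrix-vector products, the same as iterative soft thresholding; the only difference is the correction term in the residual update.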

Compressed sensing with ‘spatially coupled’ sensing matrices provides a spectacular application
of this approach. The papers [DJM13, JM12] show that such a scheme can effectively solve the
reconstruction problem with undersampling rates close to the fraction of non-zero coordinates. For
sparse signals, i.e., sequences of dimension n with k(n) non-zero entries, this implies
reconstruction from k(n) + o(n) measurements. For ‘discrete’ signals, i.e., signals whose
coordinates take values in a fixed finite set, this implies reconstruction from o(n) measurements.
The result is robust with respect to noise and applies to non-random signals.

This phase transition phenomenon survives for ‘spatially coupled’ matrices with a considerable
amount of structure. In particular, the paper [JM12] studies the problem of reconstructing signals
that are sparse in the Fourier domain from a subsampled Gabor transform. The results are
illustrated in Fig. 2.

Finding highly connected/atypical subnetworks

Numerous modern data sets have network structure, i.e., the data set consists of observations on
pairwise relationships among a set of n objects. A recurring computational problem in this context
is that of identifying a small subset of ‘atypical’ observations against a noisy background. The
motivation can be, for instance, to find a tightly connected small community in a large social
network.

The paper [DM13] develops a new type of algorithm and analysis for this problem. In particular,
the new algorithm improves over the best methods for finding a hidden clique in an otherwise random
graph, a special case that has attracted substantial interest within theoretical computer science.

The new algorithm is based on a different philosophy from previous approaches to the same
problem. It aims at estimating the hidden set by computing, for each vertex in the network,
the posterior probability that it belongs to the hidden set, given the observed data.

This is, in general, an intractable problem. We therefore consider a message passing algorithm
derived from belief propagation, a heuristic machine learning method for approximating posterior
probabilities in graphical models. We develop a rigorous analysis of this algorithm that is
asymptotically exact as $N \to \infty$, and prove that the algorithm indeed converges to the
correct set of vertices if

$$\lambda\,|S| \ge \sqrt{N/e}\,(1+\varepsilon). \qquad (1)$$

Here S is the hidden set, with size $|S|$, $\lambda$ quantifies the difference between connections
within and outside the subnetwork, $e$ is the base of the natural logarithm, and finally
$\varepsilon$ is an arbitrary positive number.

Assessing  uncertainty  in  high  dimensional  statistics 

Fitting high-dimensional statistical models often requires the use of non-linear parameter
estimation procedures. As a consequence, it is generally impossible to obtain an exact
characterization of the probability distribution of the parameter estimates. This in turn implies
that it is extremely challenging to quantify the uncertainty associated with a certain parameter
estimate. Concretely, no commonly accepted procedure exists for computing classical measures of
uncertainty and statistical significance, such as confidence intervals or p-values.

The papers [JM13b, JM13a] consider a broad class of regression problems, and propose an efficient
algorithm for constructing confidence intervals and p-values. The resulting confidence intervals
have nearly optimal size. When testing the null hypothesis that a certain parameter vanishes, the
new method has nearly optimal power.

The  new  approach  is  based  on  constructing  a  ‘de-biased’  version  of  regularized  M-estimators. 
The  new  construction  improves  over  recent  work  in  the  field  in  that  it  does  not  assume  a  special 
structure  on  the  design  matrix. 
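
To illustrate the idea of de-biasing, the sketch below applies a correction of the form $\hat\theta^u = \hat\theta + \frac{1}{n} M X^T (y - X\hat\theta)$ to a Lasso estimate in a linear model. It rests on simplifying assumptions that are not those of [JM13b, JM13a]: the number of samples exceeds the number of parameters, so that M can be taken to be the inverse of the sample covariance (the high-dimensional setting requires a different construction of M), and the noise level `sigma` is treated as known. The function names, the plain iterative soft-thresholding Lasso solver, and the 1.96 normal quantile are illustrative choices.

```python
import numpy as np

def lasso_ista(X, y, lam, iters=500):
    """Lasso via iterative soft thresholding for (1/2n)||y - X theta||^2 + lam ||theta||_1."""
    n, p = X.shape
    L = np.linalg.eigvalsh(X.T @ X / n)[-1]   # Lipschitz constant of the gradient
    theta = np.zeros(p)
    for _ in range(iters):
        step = theta + X.T @ (y - X @ theta) / (n * L)
        theta = np.sign(step) * np.maximum(np.abs(step) - lam / L, 0.0)
    return theta

def debias(X, y, theta_hat, M):
    """De-biased estimate: theta_hat + (1/n) M X^T (y - X theta_hat)."""
    n = X.shape[0]
    return theta_hat + M @ X.T @ (y - X @ theta_hat) / n

def confidence_intervals(X, theta_u, M, sigma, z=1.96):
    """Approximate 95% intervals, assuming the noise level sigma is known."""
    n = X.shape[0]
    Sigma_hat = X.T @ X / n
    half = z * sigma * np.sqrt(np.diag(M @ Sigma_hat @ M.T) / n)
    return theta_u - half, theta_u + half

# Example in the low-dimensional regime (n > p), where M = inverse sample covariance.
rng = np.random.default_rng(0)
n, p, sigma = 200, 20, 0.5
X = rng.standard_normal((n, p))
theta0 = np.zeros(p)
theta0[:3] = [2.0, -1.5, 1.0]
y = X @ theta0 + sigma * rng.standard_normal(n)
theta_hat = lasso_ista(X, y, lam=0.1)
M = np.linalg.inv(X.T @ X / n)
theta_u = debias(X, y, theta_hat, M)
lower, upper = confidence_intervals(X, theta_u, M, sigma)
```

With this particular choice of M the de-biased estimate coincides with ordinary least squares, which gives a simple sanity check of the correction term.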